On The Stability of Interpretable Models
Interpretable classification models are built with the purpose of providing a
comprehensible description of the decision logic to an external oversight
agent. When considered in isolation, a decision tree, a set of classification
rules, or a linear model, are widely recognized as human-interpretable.
However, such models are generated as part of a larger analytical process. Bias
in data collection and preparation, or in model construction, may severely
affect the accountability of the design process. We conduct an experimental
study of the stability of interpretable models with respect to feature
selection, instance selection, and model selection. Our conclusions should
raise the scientific community's awareness of the need for a stability impact
assessment of interpretable models.
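To make the notion of stability with respect to instance selection more concrete, the sketch below resamples a dataset and checks how much the selected feature subset changes. It is not the authors' experimental protocol: the selector `top_k_features` (ranking by difference of class means), the bootstrap resampling, and the use of Jaccard similarity are all assumptions chosen for illustration.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def top_k_features(rows, labels, k):
    """Hypothetical selector: rank features by |mean(class 1) - mean(class 0)|."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        if not pos or not neg:
            # A resample with a single class carries no ranking information.
            scores.append(0.0)
        else:
            scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def selection_stability(rows, labels, k, n_resamples=20, seed=0):
    """Mean pairwise Jaccard similarity of the feature sets selected on
    bootstrap resamples of the data (requires n_resamples >= 2)."""
    rng = random.Random(seed)
    n = len(rows)
    selected = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        selected.append(
            top_k_features([rows[i] for i in idx], [labels[i] for i in idx], k)
        )
    pairs = list(combinations(selected, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A value near 1 means the same features are chosen regardless of which instances are sampled; lower values indicate that the resulting model, however readable, depends heavily on the sample.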
Classes of Terminating Logic Programs
Termination of logic programs depends critically on the selection rule, i.e.
the rule that determines which atom is selected in each resolution step. In
this article, we classify programs (and queries) according to the selection
rules for which they terminate. This is a survey and unified view of different
approaches in the literature. For each class, we present a sufficient, for most
classes even necessary, criterion for determining that a program is in that
class. We study six classes: a program strongly terminates if it terminates for
all selection rules; a program input terminates if it terminates for selection
rules which only select atoms that are sufficiently instantiated in their input
positions, so that these arguments do not get instantiated any further by the
unification; a program local delay terminates if it terminates for local
selection rules which only select atoms that are bounded w.r.t. an appropriate
level mapping; a program left-terminates if it terminates for the usual
left-to-right selection rule; a program exists-terminates if there exists a
selection rule for which it terminates; finally, a program has bounded
nondeterminism if it only has finitely many refutations. We propose a
semantics-preserving transformation from programs with bounded nondeterminism
into strongly terminating programs. Moreover, by unifying different formalisms
and making appropriate assumptions, we are able to establish a formal hierarchy
between the different classes.
Comment: 50 pages. The following mistake was corrected: In figure 5, the first
clause for insert was insert([],X,[X]
Variable Ranges in Linear Constraints
We introduce an extension of linear constraints, called linear-range constraints, which allows for (meta-)reasoning about the approximation width of variables. Semantics for linear-range constraints is provided in terms of parameterized linear systems. We devise procedures for checking satisfiability and for entailing the maximal width of a variable. An extension of the constraint logic programming language CLP(R) is proposed by admitting linear-range constraints.
A KDD process for discrimination discovery
The acceptance of analytical methods for discrimination discovery by practitioners and legal scholars can only be achieved if the data mining and machine learning communities are able to provide case studies, methodological refinements, and the consolidation of a KDD process. We summarize here an approach along these directions.
A Relational Approach to Networks in a Tourism Destination: Business and Family Networks in San Vito Lo Capo, Sicily
This article constructs a relational framework using the principles of the Network Approach to examine the business exchange structure of a tourist destination. Network Analysis is the methodology used to analyse the metrics of collaboration and cooperation among destination companies. The model was applied in a remote tourist destination named San Vito Lo Capo on the island of Sicily, where tourism has significantly expanded in the last twenty years. The focus is on how groupings of small companies within family relations can govern and be responsible for tourism destination cooperation. The main result is the identification of a relational framework in which three clusters of families with a high density of exchanges emerge. These families can influence the tourism business at the destination, guaranteeing cooperation among other business companies. The findings show the existence and the importance of informal business networks and the contribution of Network Analysis to understanding the structure and cohesiveness of a tourist destination.
The Initial Screening Order Problem
In this paper we present the initial screening order problem, a crucial step
within candidate screening. It involves a human-like screener with an objective
to find the first k suitable candidates rather than the best k suitable
candidates in a candidate pool given an initial screening order. The initial
screening order represents the way in which the human-like screener arranges
the candidate pool prior to screening. The choice of initial screening order
has considerable effects on the selected set of k candidates. We prove that
under an unbalanced candidate pool (e.g., having more male than female
candidates), the human-like screener can suffer from uneven efforts that hinder
its decision-making over the protected, under-represented group relative to the
non-protected, over-represented group. Other fairness results are proven under
the human-like screener. This research is based on a collaboration with a large
company to better understand its hiring process for potential automation. Our
main contribution is the formalization of the initial screening order problem
which, we argue, opens the path for future extensions of current work on
ranking algorithms, fairness, and automation for screening procedures.
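The difference between searching for the first k and the best k suitable candidates can be sketched in a few lines. This is only an illustration of the idea, not the paper's formalization: the candidate representation (name, score), the suitability threshold, and the function names are hypothetical.

```python
def first_k(pool, k, is_suitable):
    """Scan the pool in its given order and stop once k suitable candidates
    are found. Returns the selected candidates and the number examined."""
    selected, examined = [], 0
    for candidate in pool:
        if len(selected) == k:
            break
        examined += 1
        if is_suitable(candidate):
            selected.append(candidate)
    return selected, examined


def best_k(pool, k, is_suitable, score):
    """Examine the whole pool and keep the k highest-scoring suitable candidates."""
    suitable = [c for c in pool if is_suitable(c)]
    return sorted(suitable, key=score, reverse=True)[:k]


# Hypothetical pool of (name, score) pairs; "suitable" means score >= 0.5.
pool = [("a", 0.9), ("b", 0.4), ("c", 0.7), ("d", 0.8)]
suitable = lambda c: c[1] >= 0.5
score = lambda c: c[1]

print(first_k(pool, 2, suitable))        # depends on the screening order
print(first_k(pool[::-1], 2, suitable))  # same pool, reversed order
print(best_k(pool, 2, suitable, score))  # order-independent, up to ties
```

Because `first_k` stops early, both the selected set and the number of candidates examined change with the initial order, which is the effect the abstract attributes to the choice of initial screening order.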
Data Mining for Discrimination Discovery
In the context of civil rights law, discrimination refers to unfair or unequal treatment of people based on membership in a category or a minority, without regard to individual merit. Discrimination in credit, mortgage, insurance, labor market, and education has been investigated by researchers in economics and human sciences. With the advent of automatic decision support systems, such as credit scoring systems, the ease of data collection opens several challenges to data analysts for the fight against discrimination. In this paper, we introduce the problem of discovering discrimination through data mining in a dataset of historical decision records, taken by humans or by automatic systems. We formalize the processes of direct and indirect discrimination discovery by modelling protected-by-law groups and contexts where discrimination occurs in a classification-rule-based syntax. Classification rules extracted from the dataset allow for unveiling contexts of unlawful discrimination, where the degree of burden over protected-by-law groups is formalized by an extension of the lift measure of a classification rule. In direct discrimination, the extracted rules can be directly mined in search of discriminatory contexts. In indirect discrimination, the mining process needs some background knowledge as a further input, e.g., census data, which combined with the extracted rules might allow for unveiling contexts of discriminatory decisions. A strategy adopted for combining extracted classification rules with background knowledge is called an inference model. In this paper, we propose two inference models and provide automatic procedures for their implementation. An empirical assessment of our results is provided on the German credit dataset and on the PKDD Discovery Challenge 1999 financial dataset.
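The abstract does not spell out the extension of the lift measure. One plausible reading, assumed here, is the ratio between the confidence of a rule that includes the protected group in its premise and the confidence of the same rule with the group removed. The sketch below computes that ratio; the function names, the record layout, and the toy records are made up for illustration and are not taken from either dataset mentioned above.

```python
def confidence(records, premise, consequence):
    """Confidence of the rule premise -> consequence: the fraction of records
    matching the premise that also match the consequence.
    Returns None when no record matches the premise."""
    covered = [r for r in records if premise.items() <= r.items()]
    if not covered:
        return None
    hits = sum(1 for r in covered if consequence.items() <= r.items())
    return hits / len(covered)


def extended_lift(records, protected, context, consequence):
    """Assumed definition: conf(protected, context -> consequence)
    divided by conf(context -> consequence)."""
    with_group = confidence(records, {**context, **protected}, consequence)
    baseline = confidence(records, context, consequence)
    if with_group is None or not baseline:
        return None  # undefined when a premise is never met or the baseline is 0
    return with_group / baseline


# Toy, fabricated records used only to exercise the functions.
records = [
    {"city": "X", "group": "p", "decision": "deny"},
    {"city": "X", "group": "p", "decision": "deny"},
    {"city": "X", "group": "q", "decision": "grant"},
    {"city": "X", "group": "q", "decision": "grant"},
    {"city": "Y", "group": "p", "decision": "grant"},
]
ratio = extended_lift(
    records,
    protected={"group": "p"},
    context={"city": "X"},
    consequence={"decision": "deny"},
)
print(ratio)
```

Under this reading, a ratio well above 1 flags a context in which the negative decision is noticeably more frequent for the protected group than for everyone in that context.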